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Abstract 

The boundary crossing probability of a Poisson process with n jumps is a fundamental quantity 
with numerous applications. We present a fast 0(n 2 log n ) algorithm to calculate this probability for 
arbitrary upper and lower boundaries. 
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1. Introduction 

Let Xi,..., X n be n i.i.d. random variables drawn from C/[0,1] and let F n be their empirical 
cumulative distribution function, 

F n {t)= l Y j l{X i <t). 

n 

i 

Given two arbitrary functions g , h : [0,1] —► R, we define the corresponding two-sided non-crossing 
probability as 


Pr 


Vf e [0,1] : g{t) < F n {t) < h(t) 


(1) 


This probability plays a fundamental role in a wide range of applications, including the computation 
of p -values and power of sup-type continuous goodness-of-fit statistics (Kolmogorov [1], Steck [2], Noe 
and Vandewiele [3], Noe [4], Durbin [5], Kotel’Nikova and Khmaladze [6], Friedrich and Schellhaas 
[7], Khmaladze and Shinjikaslivili [8]); construction of confidence bands for empirical distribution 
functions (Owen [9], Frey [10], Matthews [11]); change-point detection (Worsley [12]); and sequential 
testing (Dongchu [13]). Note that many of these applications consider a more general case, where 
Xi ,..., X n F for some known continuous distribution F. However, this setting is easily reducible 
to the particular case F = U[ 0,1] by transforming the random variables X,, h>• F(Xi ) and the boundary 
functions as g*(t) = < 7 (F _1 (t)) and h*(t) = /i(F _1 (t)). 

One popular approach is to estimate Eq. (1) using Monte-Carlo methods. In the simplest of these 
methods one repeatedly generates X- t ,.... X n ~ C7[0,1] and counts the number of times that the 
inequalities g(t) < F n (t) < h(t) are satisfied for all t. This approach, however, can be extremely slow 
when the probability of interest is small and the sample size n is large. 

The focus of this paper is on the fast computation of the exact two-sided crossing probability in Eq. 
(1) given arbitrary boundary functions. In the one-sided case (where either g(t) < 0 or h{t) > 1 for all 
0 < t < 1), Eq. (1) can be computed in 0(n 2 ) operations (Noe and Vandewiele [3], Kotel’Nikova and 
Khmaladze [6], Moscovich et al. [14]). Even faster solutions exist for some specialized cases, such as a 
single linear boundary (Durbin [5]). For general boundaries, however, essentially all existing methods 
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require 0(n 3 ) operations (Steck [2], Durbin [15], Noe [4], Friedrich and Schellhaas [7], Khmaladze and 
Shinjikashvili [8]) 1 . 

The main contribution of this paper is the introduction of a fast 0(n 2 logn) algorithm to compute 
the two-sided crossing probability for general boundary functions. This is done by investigating a 
closely related problem involving a Poisson process. Specifically, let £ n (i) ■ [0,1] —> {0,1, 2 ...} be a 
homogeneous Poisson process of intensity n and let g, h : [0,1] —>• K. be two arbitrary boundaries. As 
noted in Section 3, there is a well known reduction from the probability of interest in Eq. (1) to the 
following two-sided non-crossing probability, 

Pr [Vf G [0,1] : g(t) < £ n (t) < h(t) | £„(1) = k] . (2) 

The key observation in this paper, described in Section 2, is that the recursive solution to Eq. (2) 
given by Khmaladze and Shinjikashvili [8] can be described as a series of at most 2 n truncated linear 
convolutions involving vectors of length at most n. Using the Fast Fourier Transform (FFT), each 
convolution can thus be computed in O(nlogn) operations, yielding a total running time of 0(n 2 logn). 

In section 4 we present an application of the proposed method to the computation of p -values for a 
continuous goodness-of-fit statistic. Comparing the run-times of our algorithm to those of Khmaladze 
and Shinjikashvili [8] shows that our method yields significant speedups for large sample sizes. 

Finally, since Brownian motion can be described as a limit of a Poisson process, one may apply 
our method to approximate the boundary crossing probability and first passage time of a Brownian 
motion, see for example Khmaladze and Shinjikashvili [8]. The latter quantity has multiple applications 
in finance and statistics (Siegmund [16], Chicheportiche and Bouchaud [17]). In this case an accurate 
approximation may require a fine discretization of the continuous boundaries, or equivalently a large 
value of n. Hence, fast algorithms are needed. Furthermore, our approach can be extended to higher 
dimensions, where it may be used to quickly approximate various quantities related to Brownian motion 
in 2 or 3 dimensions. 


2. Boundary crossing probability for a Poisson process 

In this section we describe our proposed algorithm for the fast computation of the two-sided non¬ 
crossing probability of a Poisson process, given in Eq. (2). We assume that g(t) < h(t) for all t G [0,1] 
and that g(0) < 0 < h(0), as otherwise the non-crossing probability is simply zero. Also, since the 
Poisson process is monotone, w.l.o.g. the two functions g(t) and h(t) may be assumed to be monotone 
non-decreasing. We start by describing the recursion formula of Khmaladze and Shinjikashvili [8] 
whose direct application yields an 0(n 3 ) algorithm, and then show how to reduce the computational 
cost to 0{n 2 logn) operations. 

For every integer i G [0, < 7 ( 1 )], let = inf{t € [0, 1] : g(t) > 1 } be the first time the function g(t) 
passes the integer i. Similarly for every integer i G [ft.(0), h(l)], let = sup{f G [0,1] : h(t) < i} be the 
last time the function h(t) is bounded by i. Let T(g) = {tf}o<i<g(i) and T(h) = {t 1 ?}h(o)<i<h(i) be the 
set of all integer crossing times for the two functions. As illustrated in Figure 1, a non-decreasing step 
function / : [0,1] — > {0,1, 2,...} satisfies g(t) < f(t) < h(t) for all t G [0,1] if and only if it satisfies these 
conditions at all discrete times t G T{g) U T(h) U {1}. Hence, to compute the probabilities in equations 
(1), and (2), it suffices to analyze these inequalities only at a finite set of TV = | T(g) U T{h) U {1}| 
times. 

Definition 1. Let £ n (i) denote a one-dimensional Poisson process with intensity n. For any s G [0, 1] 
and to G {0,1,2,..., n}, define Q(s , to) as the probability that £ n (s) = to and that does not cross 


1 The procedure of Steck [2] is based on the computation of an n X n matrix determinant and Durbin [15] is based 
on solving a system of linear equations. Theoretically, using the Coppersmith-Winograd fast matrix multiplication 

algorithm, both methods yield an 0(n 2 373 ) solution. However this method is never used in practice because of the huge 

constant factors involved. 
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Figure 1: A two-sided boundary and a scaled empirical distribution nF n (t) of n = 5 samples. In this 
illustration nF n happens to cross the upper boundary function h(t). Empty circles mark the integer crossing 
points of g(t), h(t) and determine discrete times ti < ... < tjv = 1 which correspond to layers of a transition 
graph. Note that nF n (t) crosses one of the boundaries if and only if it intersects an empty circle. 


the boundaries g{t),h(t) up to time s. i.e. 

Q(s, m) := Pr [Vt € [0, s] : g(t) < £„(f) < h(t) and £ n (s) = m]. 

Of particular interest are the values Q(l,m) which correspond to Poisson processes that never cross 
the boundaries. Clearly Q(0, 0) = 1 and Vm > 0 : Q(0,m) = 0. Let t\ < ... < tjv = 1 denote the 
sorted set of times from T(g) U T(h ) U {1}. For any i £ {0,N — 1} and any m € {0,1,2,...} the 
Chapman-Kolmogorov equations give 


rt . . _ ■Pr[Z i =m~ £} if g(t i+ i) < m < h(t i+1 ) 

I 0 otherwise 

where Z r is a Poisson random variable with intensity n(U+\ — U) and the sum is taken over all 
g(U) < i < m. This formula was proposed by Khmaladze and Shinjikashvili [8] in order to compute 
<5(1, n). All quantities up to the final time £jv = 1 can be computed recursively as follows: first calculate 
explicitly the probabilities Q(ti, 0),..., n) at time t 1 . Next, calculate all probabilities at time 
ti +1 using the quantities from time ti, and so on. Since each Q{U, m) is a sum of up to m + 1 < n + 1 
terms and since N < 2n + 1. the total run-time is at most 0(n 3 ), but may be smaller if the boundary 
functions g{t),h(t) are close to each other. 

Next, we describe a faster procedure. Let Q ti = {Q{U, 0), Q{U, 1),..., Q(U, n)) and let tt\ = 
(Pr [Z = 0], Pr [Z = 1],..., Pr [Z = n]), where Z ~ Poisson(A). The key observation is that the vector 
Qt i+1 in Eq. (3) is nothing but a truncated linear convolution of the vectors Qti an d 7r n ( t . +1 _ t .). Hence 
we may apply the circular convolution theorem to compute it in the following fashion: 

1. Append n zeros to the end of the two vectors Qt , and 7r„( t . +1 _ ti ), denoting the resulting vectors 
Q 2n and n 2n respectively. 

2. Compute the Fourier transform of the zero-extended vectors F{Q 2n } and F{r: 2n }. 

3. Use the convolution theorem to obtain the Fourier transform of the convolution, 

C 2n = F{Q 2n * 7T 2 "} = F{Q 2n } ■ F{Ti 2n }, 
where * denotes cyclic convolution and • denotes pointwise multiplication. 
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4. Compute the inverse Fourier transform of C 2n to yield the vector Qt i+1 


n ( \ jJ 7 1 {C 2n }{m) if g{t i+1 ) < m < h(t i+1 ) 
ft »' (m) = \o otherwise. 

Using the FFT algorithm, each Fourier Transform takes 0(nlogn) time. Repeating these four steps 
for all times f £ T{g) U T(h) U {1} yields a worst-case total run-time of 0{n 2 logn). However, it may 
be much lower if the functions g(t) and hit) are close to each other. For more details on the FFT and 
the computation of discrete convolutions, we refer the reader to Press et al. [18, Chapters 12, 13]. 


3. Boundary crossing probability for the empirical CDF 

We now return to the problem of calculating the probability in Eq. (1), that an empirical CDF 
will cross prescribed upper and lower boundaries. To simplify notation, we look at the scaled function 
nF n {t ) instead of F n {t ), and similarly to the previous section, consider the probabilities 


R(s,m) = Pr Vf £ [0, s] : g(t) < nF n (t) < h{t) and nF n (t) 



Let 0 = to < ti < ■ ■ ■ < tN = 1 be as before, and let 

Ze t i ~ Binomial (n — £, ti fff ti ^ . 

The Chapman-Kolmogorov equations give the recursive relations of Friedrich and Schellhaas [7] 

j^ e R(t i ,£)-PT[Zi ii = m-£] if g(t i+1 ) <m< h(t i+1 ) 

H'V'i+li'™') — a 

0 otherwise. 


( 4 ) 


In contrast to Eq. (3), the expression for Rt i+1 , the vector of probabilities at time ti+i, is not in the 
form of a straightforward convolution, and hence cannot be directly computed using the FFT. While 
not the focus of our work, we note that by some algebraic manipulations, it is possible to compute Eq. 
(4) using a convolution and an additional 0{n ) operations. Instead, we present a simpler construction 
that builds upon the results of the previous section. To this end we recall a well-known reduction from 
the empirical CDF to the Poisson process (Shorack and Wellner [19, Chapter 8 , Proposition 2.2]): 

Lemma 1. The distribution of the process nF n {t) is identical to that of a Poisson process £ n {t) with 
intensity n, conditioned on £ n (l) = n. 

According to this lemma, the non-crossing probability of an empirical CDF can be efficiently 
computed by a reduction to the Poisson case, since 


Pr Wt : g(t ) < nF n (t) < h(t ) = Pr [Vf : g(t) < £„(t) < h(t) |£„(1) = n] 
_ Pr [Vf : 5 (f) < £„(f) < h(t ) and £„(1) =n] Q(n,n) 


( 5 ) 


Pr [Poisson(n) = n] 


n ,l e~ 


' l /n\ 


and Q(n,n ) can be computed efficiently, as described in Section 2. 


4. Computing p-values for goodness-of-fit statistics 

The results of the previous sections can be used to compute the p -value of several two-sided contin¬ 
uous goodness-of-fit statistics such as Kolmogorov-Smirnov, and their power against specific alterna¬ 
tives. Our algorithm may also be applied to one-sided statistics such as the Higher-Criticism statistic 
of Donoho and Jin [20]. 
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To this end, recall the setup in the classical continuous goodness-of-fit testing problem. Let 
X\, X2, ■ ■ ■, x n be n real-valued samples. We wish to assess the validity of a null hypothesis that 
Xi,...,x n are sampled i.i.d from a known (and fully specified) continuous distribution function F 
against an unknown and arbitrary alternative G, 

Ho : Xi * F vs. Hi : Xi *~ d ' G with G ^ F. 


Let Ui = F(xi) be the probability integral transform of the i-th sample, and ii(i) < it( 2 ) < ... < ut n ) be 
the sorted sequence of transformed samples. Under the null hypothesis, each iq is uniformly distributed 
in [0,1] and therefore Uu is the *-th order statistic of a uniform distribution. 

A common approach to goodness-of-fit testing is to measure the distance of the different order 
statistics from their expectation under the null. A classical example is the Kolmogorov-Smirnov 
statistic K n := ma x{K~ ,K+}, where K~ and iL+ are the one-sided KS statistics, defined as 


K n = \Jn max 

,n 




K+ = \fn max 
2 = 1 ,...,n 



More generally, given a sequence of monotone-increasing functions rq,..., r n : R —> ffi. and a sequence 
of decreasing functions Si,..., s n : R —> K, one may define one-sided goodness-of-fit statistics by 


R := max r,(u(p) and S := max Si(u(i)) 
2 = 1 ,...,n v ' 2 = 1 ,...,n v y 

and a two-sided statistic by 


( 6 ) 


T := max{i?, S}. (7) 

Statistics of this form include the supremum Anderson-Darling statistic and other weighted Kolmogorov- 
Smirnov statistics [1, 21], the R n statistic of Berk and Jones [22] and Phi-divergence supremum statis¬ 
tics [23]. Similarly, the one-sided Higher Criticism statistic of Donoho and Jin [20] and its variants 
follow the form of the one-sided statistic S in Eq. ( 6 ). 

It is easy to verify that T < t if and only if s“ 1 (t) < Uu\ < r“ 1 (t) holds for all i. Therefore, the 
p -value of T = t is equal to 

Pr [T > t\H 0 \ = 1 - Pr [VI < * < n : sf^t) < U (i) < rf^i)] , ( 8 ) 

where t/( 1 ),..., £/(„) are the order statistics of n draws from t/[0,1]. Define two functions gt(x) and 
h t (x ) as follows, 

n n 

9t(x ) = < x), h t {x) = Y < x), 

2=1 2=1 

then the probability of Eq. ( 8 ) is equal to that of Eq. (5) which we can compute in time 0{n 2 logn). 
4-1. Simulation Results 

We evaluate the empirical run-time of our procedure for computing p-values of the two-sided M n 
and one-sided M+ goodness-of-fit statistics of Berk and Jones [22]. These statistics have the form of 
equations ( 6 ) and (7) but with a minimum instead of a maximum (see [14, Section 3]). 

To this end we wrote an efficient implementation of the proposed procedure using the FFTW3 
library by Frigo and Johnson [24] and compared it to a direct implementation of the Khmaladze and 
Shinjikashvili [ 8 ] recursion relations (denoted ”KS 2001”). In addition, we implemented the 0{n 2 ) 
one-sided algorithm of Moscovich et al. [14] (denoted ”MNS 2016”). We find that both two-sided 
procedures are numerically stable using standard double-precision (64-bit) floating point numbers, 
even for sample sizes as large as n = 250, 000. In contrast, the one-sided procedure [14] requires a 
careful numerical implementation using extended-precision (80-bit) floating point numbers and breaks 
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Two-sided 


One-sided 



Figure 2: Runtime comparison of our algorithm compared to that of Khmaladze and Shinjikashvili [8] (KS 
2001) and to the one-sided method described in Moscovich et al. [14] (MNS 2016). The boundaries were 
chosen such that the p -value of M n , equal to its two-sided boundary crossing probability, is 5%. Note that the 
one-sided case is much slower to compute. 


down completely for sample sizes n > 50,000. Figure 2 presents a runtime comparison of the three 
algorithms for computing one-sided and two-sided crossing probabilities 2 . 

Somewhat counter-intuitively, the one-sided case is much more expensive than the two-sided case. 
This is made clear by examining Eq. (3) and noting that in the one-sided case the variable m has 
a large valid range averaging around n/2 , whereas in the two-sided case this range is typically much 
smaller. In all cases, our procedure is the fastest of all 3 methods. Surprisingly, this is true even in 
the one-sided case where the 0{n 2 ) procedure of Moscovich et al. [14] is asymptotically superior. 

Finally, we note that for large sample sizes, one may be inclined to forgo exact computation of p- 
values and instead use the asymptotic null distribution of the particular test statistic in use (assuming 
it is known). However, this does not always provide an adequate approximation, particularly as in 
several cases the finite sample distribution converges very slowly to its limiting form. Depending on 
the application, even the currently best known approximations may not be sufficiently accurate. For 
more on this topic, see Li and Siegmund [25]. 
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